Hazard Analysis


Systematic Hazard Analysis for Frontier AI using STPA

Mylius, Simon

arXiv.org Artificial Intelligence

All of the frontier AI companies have published safety frameworks in which they define capability thresholds and risk mitigations that determine how they will safely develop and deploy their models. Adoption of systematic approaches to risk modelling, based on established practices used in safety-critical industries, has been recommended; however, frontier AI companies do not currently describe in detail any structured approach to identifying and analysing hazards. STPA (Systems-Theoretic Process Analysis) is a systematic methodology for identifying how complex systems can become unsafe, leading to hazards. It achieves this by mapping out controllers and controlled processes, then analysing their interactions and feedback loops to understand how harmful outcomes could occur (Leveson & Thomas, 2018). We evaluate STPA's ability to broaden the scope, improve traceability and strengthen the robustness of safety assurance for frontier AI systems. Applying STPA to the threat model and scenario described in 'A Sketch of an AI Control Safety Case' (Korbak et al., 2025), we derive a list of Unsafe Control Actions. From these we select a subset and explore the Loss Scenarios that lead to them if left unmitigated. We find that STPA is able to identify causal factors that may be missed by unstructured hazard analysis methodologies, thereby improving robustness. We suggest STPA could increase the safety assurance of frontier AI when used to complement or check coverage of existing AI governance techniques, including capability thresholds, model evaluations and emergency procedures. The application of a systematic methodology supports scalability by increasing the proportion of the analysis that can be conducted by LLMs, reducing the burden on human domain experts.
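
To make the STPA vocabulary concrete, here is a minimal sketch, assuming nothing beyond the abstract's description: controllers issue control actions over controlled processes, and candidate Unsafe Control Actions are enumerated by crossing each control action with the four standard guide words from the STPA Handbook (Leveson & Thomas, 2018). The class names and the example control structure are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass

# The four standard UCA guide words from the STPA Handbook
# (Leveson & Thomas, 2018).
GUIDE_WORDS = [
    "not provided when needed",
    "provided when it causes a hazard",
    "provided too early, too late, or out of order",
    "stopped too soon or applied too long",
]

@dataclass
class ControlAction:
    controller: str   # entity issuing the action
    action: str       # the command itself
    process: str      # the controlled process receiving it

@dataclass
class UnsafeControlAction:
    control_action: ControlAction
    guide_word: str
    context: str = "TBD: context in which this variant is hazardous"

def enumerate_candidate_ucas(actions):
    """Cross every control action with every guide word to produce
    candidate UCAs; analysts then keep only those that map to a hazard."""
    return [UnsafeControlAction(ca, gw) for ca in actions for gw in GUIDE_WORDS]

# Hypothetical control structure, loosely inspired by an AI-control
# setting: a monitor overseeing an untrusted model.
actions = [
    ControlAction("monitor", "halt task execution", "untrusted model"),
    ControlAction("monitor", "escalate to human review", "untrusted model"),
]

for uca in enumerate_candidate_ucas(actions):
    print(f"{uca.control_action.controller}: '{uca.control_action.action}' "
          f"-- {uca.guide_word}")
```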


From Hazard Identification to Controller Design: Proactive and LLM-Supported Safety Engineering for ML-Powered Systems

Hong, Yining, Timperley, Christopher S., Kästner, Christian

arXiv.org Artificial Intelligence

Machine learning (ML) components are increasingly integrated into software products, yet their complexity and inherent uncertainty often lead to unintended and hazardous consequences, both for individuals and society at large. Despite these risks, practitioners seldom adopt proactive approaches to anticipate and mitigate hazards before they occur. Traditional safety engineering approaches, such as Failure Mode and Effects Analysis (FMEA) and System Theoretic Process Analysis (STPA), offer systematic frameworks for early risk identification but are rarely adopted. This position paper advocates for integrating hazard analysis into the development of any ML-powered software product and calls for greater support to make this process accessible to developers. By using large language models (LLMs) to partially automate a modified STPA process with human oversight at critical steps, we expect to address two key challenges: the heavy dependency on highly experienced safety engineering experts, and the time-consuming, labor-intensive nature of traditional hazard analysis, which often impedes its integration into real-world development workflows. We illustrate our approach with a running example, demonstrating that many seemingly unanticipated issues can, in fact, be anticipated.
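
The proposed division of labour (LLM drafts each step, a human intervenes at critical points) can be sketched as a simple pipeline. The step list follows the standard STPA sequence; `call_llm`, `human_review`, and the prompt templates are hypothetical stand-ins, not any real API or the paper's implementation.

```python
# A minimal sketch of an LLM-assisted STPA pipeline with human
# checkpoints. `call_llm` must be replaced with a real model client.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def human_review(step_name: str, draft: str) -> str:
    """Critical oversight step: the analyst edits or approves the draft."""
    print(f"--- {step_name} (LLM draft) ---\n{draft}")
    edited = input("Press Enter to accept, or type a revised version: ")
    return edited or draft

STPA_STEPS = [
    ("losses_and_hazards", "List the losses and system-level hazards for: {system}"),
    ("control_structure", "Describe the control structure (controllers, "
                          "control actions, feedback) given:\n{context}"),
    ("unsafe_control_actions", "For each control action, enumerate unsafe "
                               "control actions using the four STPA guide "
                               "words, given:\n{context}"),
    ("loss_scenarios", "For each unsafe control action, describe loss "
                       "scenarios (causal factors), given:\n{context}"),
]

def run_assisted_stpa(system_description: str) -> dict:
    context, results = system_description, {}
    for name, template in STPA_STEPS:
        draft = call_llm(template.format(system=system_description,
                                         context=context))
        results[name] = human_review(name, draft)   # human stays in the loop
        context += f"\n\n[{name}]\n{results[name]}"  # feed results forward
    return results
```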


Can Large Language Models assist in Hazard Analysis?

Diemert, Simon, Weber, Jens H

arXiv.org Artificial Intelligence

Large Language Models (LLMs), such as GPT-3, have demonstrated remarkable natural language processing and generation capabilities and have been applied to a variety of tasks, such as source code generation. This paper explores the potential of integrating LLMs into the hazard analysis of safety-critical systems, a process we refer to as co-hazard analysis (CoHA). In CoHA, a human analyst interacts with an LLM via a context-aware chat session and uses the responses to support elicitation of possible hazard causes. In this experiment, we explore CoHA with three increasingly complex versions of a simple system, using OpenAI's ChatGPT service. The quality of ChatGPT's responses was systematically assessed to determine the feasibility of CoHA given the current state of LLM technology. The results suggest that LLMs may be useful for supporting human analysts performing hazard analysis.
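
The CoHA interaction pattern, a context-aware chat session in which the analyst elicits hazard causes, might look roughly like the following sketch. The message format mirrors common chat APIs; `send`, the prompts, and the session structure are assumptions for illustration, not the authors' tooling.

```python
# A minimal sketch of a CoHA-style session: the full message history is
# resent on each turn, which is what makes the session context-aware.

def send(messages: list[dict]) -> str:
    raise NotImplementedError("plug in an LLM chat client here")

def coha_session(system_description: str, hazards: list[str]) -> dict:
    history = [{
        "role": "system",
        "content": ("You assist with hazard analysis of a safety-critical "
                    "system. Suggest plausible causes for each hazard.\n\n"
                    f"System under analysis:\n{system_description}"),
    }]
    causes = {}
    for hazard in hazards:
        history.append({"role": "user",
                        "content": f"What could cause this hazard: {hazard}?"})
        reply = send(history)  # context-aware: the whole history is sent
        history.append({"role": "assistant", "content": reply})
        # The human analyst, not the LLM, decides which suggestions to keep.
        causes[hazard] = reply
    return causes
```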


Hazard Analysis of Collaborative Automation Systems: A Two-layer Approach based on Supervisory Control and Simulation

Huck, Tom P., Selvaraj, Yuvaraj, Cronrath, Constantin, Ledermann, Christoph, Fabian, Martin, Lennartson, Bengt, Kröger, Torsten

arXiv.org Artificial Intelligence

Safety-critical systems are typically subjected to hazard analysis before commissioning to identify and analyse potentially hazardous system states that may arise during operation. Currently, hazard analysis is mainly based on human reasoning, past experience, and simple tools such as checklists and spreadsheets. Increasing system complexity makes such approaches increasingly unsuitable. Furthermore, testing-based hazard analysis is often impractical due to the high cost or danger of physical faults. A remedy is offered by model-based hazard analysis methods, which rely either on formal models or on simulation models, each with their own benefits and drawbacks. This paper proposes a two-layer approach that combines the benefits of exhaustive analysis using formal methods with detailed analysis using simulation. Behaviours that lead to unsafe states are first synthesised from a formal model of the system using Supervisory Control Theory. The result is then fed into the simulation, where detailed analyses using domain-specific risk metrics are performed. Though the presented approach is generally applicable, this paper demonstrates its benefits on an industrial human-robot collaboration system.
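
The two-layer idea can be illustrated, under strong simplifications, with a toy finite transition model: layer 1 exhaustively enumerates action sequences that reach unsafe states (a stand-in for synthesis via Supervisory Control Theory, not the real thing), and layer 2 replays each trace in a "simulation" scored by a domain-specific risk metric. The states, actions, and metric below are invented for illustration.

```python
from collections import deque

# Toy transition model: state -> {action: next_state}.
TRANSITIONS = {
    "idle":           {"human_enters": "shared_zone", "robot_moves": "robot_active"},
    "robot_active":   {"human_enters": "collision_risk"},
    "shared_zone":    {"robot_moves": "collision_risk"},
    "collision_risk": {},
}
UNSAFE = {"collision_risk"}

def synthesise_unsafe_traces(initial="idle", max_len=4):
    """Layer 1: exhaustive breadth-first enumeration of action sequences
    that reach an unsafe state."""
    queue, traces = deque([(initial, [])]), []
    while queue:
        state, trace = queue.popleft()
        if state in UNSAFE:
            traces.append(trace)
            continue
        if len(trace) < max_len:
            for action, nxt in TRANSITIONS[state].items():
                queue.append((nxt, trace + [action]))
    return traces

def simulate_risk(trace):
    """Layer 2: stand-in for a detailed simulation; here the 'risk metric'
    is simply higher for unsafe states that are reached in fewer steps."""
    return 1.0 / len(trace)

for trace in synthesise_unsafe_traces():
    print(trace, "risk =", simulate_risk(trace))
```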


Automated causal inference in application to randomized controlled clinical trials

Wu, Jiqing, Horeweg, Nanda, de Bruyn, Marco, Nout, Remi A., Jürgenliemk-Schulz, Ina M., Lutgens, Ludy C. H. W., Jobsen, Jan J., van der Steen-Banasik, Elzbieta M., Nijman, Hans W., Smit, Vincent T. H. B. M., Bosse, Tjalling, Creutzberg, Carien L., Koelzer, Viktor H.

arXiv.org Artificial Intelligence

Randomized controlled trials (RCTs) are considered the gold standard for testing causal hypotheses in the clinical domain. However, investigating prognostic variables of patient outcome along a hypothesized cause-effect route is not feasible using standard statistical methods. Here, we propose a new automated causal inference method (AutoCI) built upon the invariant causal prediction (ICP) framework for the causal re-interpretation of clinical trial data. Compared to existing methods, we show that the proposed AutoCI can efficiently determine the causal variables, with clear differentiation, on two real-world RCTs of endometrial cancer patients with mature outcome data and extensive clinicopathological and molecular data. This is achieved by suppressing the causal probability of non-causal variables by a wide margin. In ablation studies, we further demonstrate that the assignment of causal probabilities by AutoCI remains consistent in the presence of confounders. In conclusion, these results confirm the robustness and feasibility of AutoCI for future applications in real-world clinical analysis.
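
AutoCI builds on invariant causal prediction (ICP), whose underlying idea can be sketched as follows: a predictor subset is accepted if the residuals of a fit on that subset look identically distributed across environments, and the intersection of all accepted subsets estimates the causal parents. The linear model, the Kolmogorov-Smirnov test, and the toy data below are simplifying assumptions for illustration, not the AutoCI implementation.

```python
import itertools
import numpy as np
from scipy import stats

def residuals(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def invariant(X, y, env, subset, alpha=0.05):
    """Accept `subset` if residuals in each environment are plausibly
    drawn from the same distribution as in the rest of the data."""
    Xs = X[:, subset] if subset else np.ones((len(y), 1))
    r = residuals(Xs, y)
    for e in np.unique(env):
        _, p = stats.ks_2samp(r[env == e], r[env != e])
        if p < alpha:
            return False
    return True

def icp_parents(X, y, env):
    d = X.shape[1]
    accepted = [set(s)
                for k in range(d + 1)
                for s in itertools.combinations(range(d), k)
                if invariant(X, y, env, list(s))]
    # No accepted set would indicate model misspecification.
    return set.intersection(*accepted) if accepted else None

# Toy data: X0 causes y; X1 is noise; X0 is intervened on in environment 1,
# which is the heterogeneity ICP needs to reject non-invariant subsets.
rng = np.random.default_rng(0)
env = np.repeat([0, 1], 200)
X = rng.normal(size=(400, 2))
X[env == 1, 0] += 2.0
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=400)
print(icp_parents(X, y, env))  # ideally {0}
```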